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-Abstract- 

We study the problem of finite-horizon probabilistic invariance for discrete-time Markov pro¬ 
cesses over general (uncountable) state spaces. We compute discrete-time, hnite-state Markov 
chains as formal abstractions of general Markov processes. Our abstraction differs from existing 
approaches in two ways. First, we exploit the structure of the underlying Markov process to 
compute the abstraction separately for each dimension. Second, we employ dynamic Bayesian 
networks (DBN) as compact representations of the abstraction. In contrast, existing approaches 
represent and store the (exponentially large) Markov chain explicitly, which leads to heavy mem¬ 
ory requirements limiting the application to models of dimension less than half, according to our 
experiments. 

We show how to construct a DBN abstraction of a Markov process satisfying an independence 
assumption on the driving process noise. We compute a guaranteed bound on the error in the 
abstraction w.r.t. the probabilistic invariance property; the dimension-dependent abstraction 
makes the error bounds more precise than existing approaches. Additionally, we show how 
factor graphs and the sum-product algorithm for DBNs can be used to solve the finite-horizon 
probabilistic invariance problem. Together, DBN-based representations and algorithms can be 
significantly more efficient than explicit representations of Markov chains for abstracting and 
model checking structured Markov processes. 

[Y] Introduction 

Markov processes over general uncountable state spaces appear in many areas of engineering 
such as power networks, transportation, biological systems, robotics, and manufacturing 
systems. The importance of this class of stochastic processes in applications has motivated a 
significant research effort into their foundations and their verification. 

We study the problem of algorithmically verifying finite-horizon probabilistic invariance 
for Markov processes, which is the problem of computing the probability that a stochastic 
process remains within a given set for a given finite time horizon. For finite-state stochastic 
processes, there is a mature theory of model checking discrete-time Markov chains [3, and a 
number of probabilistic model checking tools [UlIIl that compute explicit solutions to the 
verification problem. On the other hand, stochastic processes taking values over uncountable 
state spaces may not have explicit solutions and their numerical verification problems are 
undecidable even for simple dynamics [T]. A number of studies have therefore explored 
abstraction techniques that reduce the given stochastic process (over a general state space) to 
a finite-state process, while preserving properties in a quantitative sense BIT]. The abstracted 
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model allows the application of standard model checking techniques over finite-state models. 
The work in [T] has further shown that an explicit error can be attached to the abstraction. 
This error is computed purely based on continuity properties of the concrete Markov process. 
Properties proved on the finite-state abstraction can be used to reason about properties of 
the original system. The overall approach has been extended to linear temporal specifications 
[55] and software tools have been developed to automate the abstraction procedure |S] . 

In previous works, the structure of the underlying Markov process (namely, the interdepen¬ 
dence among its variables) has not been actively reflected in the abstraction algorithms, and 
the finite-state Markov chain has been always represented explicitly, which is quite expensive 
in terms of memory requirements. In many applications, the dynamics of the Markov process, 
which are characterized by a conditional kernel, often exhibit specific structural properties. 
More specifically, the dynamics of any state variable depends on only a small number of other 
state variables and the process noise driving each state variable is assumed to be independent. 
Examples of such structured systems are models of power grids and sensor-actuator networks 
as large-scale interconnected networks |22j and mass-spring-damper systems mm- 

We present an abstraction and model checking algorithm for discrete-time stochastic 
dynamical systems over general (uncountable) state spaces. Our abstraction constructs 
a finite-state Markov abstraction of the process, but differs from previous work in that 
it is based on a dimension-dependent partitioning of the state space. Additionally, we 
perform a precise dimension-dependent analysis of the error introduced by the abstraction, 
and our error bounds can be exponentially smaller than the general bounds obtained in 
[T|. Furthermore, we represent the abstraction as a dynamic Bayesian network (DBN) [14] 
instead of explicitly representing the probabilistic transition matrix. The Bayesian network 
representation uses independence assumptions in the model to provide potentially polynomial 
sized representations (in the number of dimensions) for the Markov chain abstraction for 
which the explicit transition matrix is exponential in the dimension. We show how factor 
graphs and the sum-product algorithm, developed for belief propagation in Bayesian networks, 
can be used to model check probabilistic invariance properties without constructing the 
transition matrix. Overall, our approach leads to significant reduction in computational and 
memory resources for model checking structured Markov processes and provides tighter error 
bounds. 

The material is organized in six sections. Section [^defines discrete-time Markov processes 
and the probabilistic invariance problem. Sectionj^presents a new algorithm for abstracting a 
process to a DBN, together with the quantification of the abstraction error. We discuss efficient 
model checking of the constructed DBN in Section and apply the overall abstraction 
algorithm to a case study in Section Section outlines some further directions of 
investigation. Proofs of statements are included in the Appendix. 


\~2] Markov Processes and Probabilistic Invariance 
2.1 Discrete-Time Markov Processes 

We write N for the non-negative integers N = {0,1,2,...} and N„ = {1,2,..., n}. We use 
bold typeset for vectors and normal typeset for one-dimensional quantities. 

We consider a discrete-time Markov process defined over a general state space, and 
characterized by the tuple S is the continuous state space, which we assume to 


S. Esmaeil Zadeh Soudjani and A. Abate and R. Majumdar 


3 


be endowed with a metric and to be separably B is the Borel tr-algebra associated to 5, 
which is the smallest cr-algebra containing all open subsets of S\ and : 5 x S —> [0,1] is a 
stochastic kernel, so that is a non-negative measurable function for any set B G B, 

and T^(s,-) is a probability measure on (S,B) for any s G S. Trajectories (also called traces 
or paths) of are sequences (s(0), s(l), s(2),...) which belong to the set = 5”. The 
product cr-algebra on is denoted by B. Given the initial state s(0) = Sq G 5 of the 
stochastic Kernel Tg induces a unique probability measure V on (fl, J-) that satisfies the 
Markov property: namely for any measurable set B G B and any t G N 

r{s{t+l)G B|a(0),a(l),...,a(t)) = 7^(a(t+l) G B\s{t)) = Tg{s{t), B). 

We assume that the stochastic kernel Tg admits a density function tg : S x S ^ R>o, such 
that Ts(s,i?) = Jgtg{s\s)ds. 

A familiar class of discrete-time Markov processes is that of stochastic dynamical systems. 
If {C{t)j t G N} is a sequence of independent and identically distributed (iid) random variables 
taking values in K”, and f : S x M" —>■ 5 is a measurable map, then the recursive equation 

s{t+l) = f{s{t),C{t)), Vt G N, s(0) = So G 5, (1) 

induces a Markov process that is characterized by the kernel 

Tg{s,B)=Tc{CGR^ : /(s,C)gB), 

where is the distribution of the r.v. ((^(0) (in fact, of any ^(t) since these are iid random 
variables). In other words, the map f together with the distribution of the r.v. {C(i)} 
uniquely define the stochastic kernel of the process. The converse is also true as shown in m 
Proposition 7.6]: any discrete-time Markov process admits a dynamical representation 
as in for an appropriate selection of function f and distribution of the r.v. {((^(t)}. 

Let us expand the dynamical equation 0 explicitly over its states s = [si,..., map 
components / = [fi, ■ ■ ■, fn]'^, and uncertainly terms C = [Cii ■ • • j Cn]^) as follows: 

si(f+ 1) = 

S2(t -I- 1) = f2{Sl{t),S2{t), . . . , Sn{t)X2{t)), 

: ( 2 ) 

Sn{t + 1) = fn{si{t),S 2 {t), S„ (<) , Cn (i)) ■ 

In this article we are interested in exploiting the knowledge of the structure of the dynamics 
in ([^ for formal verification via abstractions [nmiH]. We focus our attention to continuous 
(unbounded and uncountable) Euclidean spaces 5 = K", and further assume that for any 
< G N, Ck{t) are independent for all /c G N„. This latter assumption is widely used in the 
theory of dynamical systems, and allows for the following multiplicative structure on the 
conditional density function of the process: 

tg{s\s) = ti{si\s)t 2 {s 2 \s) . ..tn{Sn\s), (3) 

where the function : K" x M —> IR>o solely depends on the map fk and the distribution of 
Cfe. The reader is referred to Section for the detailed computation of the functions from 
the dynamical equations in Q. 
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A metric space S is called separable if it has a countable dense subset. 
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M Figure 1 n-body mass-spring-damper system. 


► Remark. The results of this article are presented under the structural assumption that 

Cfc(-) are independent over fc G N„. These results can be generalized to a broader class of 
processes by allowing inter-dependencies between the entries of the process noise, which 
requires partitioning the set of entries of so that any two entries from different partition 
sets are independent, whereas entries within a partition set may still be dependent. This 
assumption induces a multiplicative structure on with respect to the partition, which 

is similar to The finer the partition, the more efhcient is our abstraction process. 

► Example 1. Figure [l] shows a system of n masses connected by springs and dampers. For 

i G N„, block i has mass rrii, the i**' spring has stiffness ki, and the damper has damping 
coefficient bi. The first mass is connected to a fixed wall by the left-most spring/damper 
connection. All other masses are connected to the previous mass with a spring and a damper. 
A force Q is applied to each mass, modeling the effect of a disturbance or of process noise. 
The dynamics of the overall system is comprised of the position and velocity of the blocks. It 
can be shown that the dynamics in discrete time take the form s{t+ 1) = $s(t) + C{t), where 
s{t) G with S 2 i-i{t), S 2 i{t) indicating the velocity and position of mass i. The state 
transition matrix $ = G IU 2 "x 2 n jg ^ band matrix with lower and upper bandwidth 3 

and 2, respectively (4)^ = 0 for j < z — 3 and for j > z -|- 2). ◄ 

► Example 2. A second example of structured dynamical systems is a discrete-time large- 
scale interconnected system. Consider an interconnected system of Nj, heterogeneous linear 
time-invariant (LTI) subsystems described by the following stochastic difference equations: 

Si{t -h 1) = GijSj(t) + BiUi{t) + 

J&Ni 

where z G Nat^ denotes the z*^*' subsystem and Si G G G are the 

state, the input, and the process noise of subsystem z. The term represents 

the physical interconnection between the subsystems where Ni, |iVi| <C Nj,, is the set of 
subsystems to which system z is physically connected. The described interconnected system 
can be found in many application areas including smart power grids, traffic systems, and 
sensor-actuator networks m- ^ 

2.2 Probabilistic Invariance 

We focus on verifying probabilistic invariance, which plays a central role in verifying properties 
of a system expressed as PCTL formulae or as linear temporal specifications Elm US]- 

► Definition 3 (Probabilistic Invariance). Consider a bounded Borel set A G B, representing 
a set of safe states. The finite-horizon probabilistic invariance problem asks to compute the 
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probability that a trajectory of associated with an initial condition Sq remains within 
the set A during the finite time horizon JV: 

Piv(so, A) = 'P{s{t) G A for alH = 0,1,2,..., A^|s(0) = sq}. 

This quantity allows us to extend the result to a general probability distribution tt : B —>■ 
[0,1] for the initial state s(0) of the system as 

T’{s(t) e A for all t = 0,1,2, ■■■, IV} = f pn{so, A)Tr{dso). (4) 

Js 

Solution of the probabilistic invariance problem can be characterized via the value functions 
Vk : 5 —>■ [0,1], fc = 0,1, 2,..., iV, defined by the following Bellman backward recursion [T]: 

Vfc(s) = 1 a(s) f Vk+i{s)t^{s\s)ds for A: = 0,1,2,..., TV - 1. (5) 

Ja 

This recursion is initialized with V]\[{s) = l^(s), where 1 a(s) is the indicator function which 
is 1 if s G A and 0 otherwise, and results in the solution pjv(^O)^) = ^(sq)- 

Equation characterizes the finite-horizon probabilistic invariance quantity as the 
solution of a dynamic programming problem. However, since its explicit solution is in 
general not available, the actual computation of the quantity P]\[{sq, A) requires N numerical 
integrations at each state in the set A. This is usually performed with techniques based on 
state-space discretization [^. 

fs"! Formal Abstractions as Dynamic Bayesian Networks 
3.1 Dynamic Bayesian Networks 

A Bayesian network (BN) is a tuple *8 = (V,£’,T). The pair (V,£) is a directed Acyclic 
Graph (DAG) representing the structure of the network. The nodes in V are (discrete or 
continuous) random variables and the arcs in £ represent the dependence relationships among 
the random variables. The set T contains conditional probability distributions (GPD) in 
forms of tables or density functions for discrete and continuous random variables, respectively. 
In a BN, knowledge is represented in two ways: qualitatively, as dependences between 
variables by means of the DAG; and quantitatively, as conditional probability distributions 
attached to the dependence relationships. Each random variable G V is associated with a 
conditional probability distribution F{Xi\Pa{Xi)), where PaiY) represents the parent set of 
the variable E G V: PaiY) = {X G V|(A, E) G £}. A BN is called two-layered if the set of 
nodes V can be partitioned to two sets Vi, V 2 with the same cardinality such that only the 
nodes in the second layer V 2 have an associated GPD. 

A dynamic Bayesian network dmis] is a way to extend Bayesian networks to model 
probability distributions over collections of random variables A(0), A(l), A(2),... indexed 
by time t. A DBP^is defined to be a pair (Sq, ®-!.), where iBo is a BN which defines the 
distribution of A(0), and *8^. is a two-layered BN that defines the transition probability 
distribution for {X{t-\- l)|A(t)). 


2 


The DBNs considered in this paper are stationary (the structure of the network does not change with 
the time index t). They have no input variables and are fully observable: the output of the DBN model 
equals to its state. 
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3.2 DBNs as Representations of Markov Processes 

We now show that any discrete-time Markov process over K" can be represented as a 
DBN (*Bo, over n continuous random variables. The advantage of the reformulation is 
that it makes the dependencies between random variables explicit. 

The BN *Bo is trivial for a given initial state of the Markov process s(0) = Sq- The 
DAG of fBo has the set of nodes {Xi, X2, ■ ■ ■ ,Xn} without any arc. The Dirac delta 
distribution located in the initial state of the process is assigned to each node of $o(f] The 
DAG for the two-layered BN *B_>. = (V,£,T) comprises a set of nodes V = Vi U V 2 , with 
Vi = {Xi,X 2 ,..., Xn} and V 2 = {Xi,X 2 , ■ ■ ■, A„}. Each arc in E connects a node in Vi to 
another node in V 2 ; (Xi,Xj) £ £ \i and only if tj{sj\s) is not a constant function of Si. The 
set T assigns a GPD to each node Xj according to the density function 

► Example 4. Gonsider the following stochastic linear dynamical system: 

s{t -I- 1) = <i)s(t) + C{t) t GN, s(0) — Sq = [sqi, S 02 , • ■ ■ , Son]^) (6) 


where = [aij]ij is the system matrix and C(^) ~ -^(0,5]) are independent Gaussian 
r.v. for any t G N. The covariance matrix S is assumed to be full rank. Gonsequently, a 
linear transformation can be employed to change the coordinates and obtain a stochastic 
linear system with a diagonal covariance matrix. Then without loss of generality we assume 
E = diag{[al, cr|,..., cr^]), which clearly satisfies the independence assumption on the process 
noise raised in Section 2.1 Model (|^ for a lower bidiagonal matrix <i> can be expanded as 
follows: 


Si{t -I- 1) = aiiSi(t) -I- Ci(i) 

S2{t -I- 1) = a2lSi(f) -I- a22S2(^) + C2{t) 

S3{t - I - 1 ) = a 32 S 2 {t) + 03383(1) + (3(1) 


Sn{t -|- 1 ) — -|- -|- Cn{t), 

where Ci(’)) ^ ^ are independent Gaussian r.v. JV(0,af). The conditional density function 
of the system takes the following form: 

G(a|s) = tl(si|si)t2(s2|si,S2)t3(S3|s2,S3) . . . (s„ | S„_i, S„) . 

The DAG of the two-layered BN !B_>. associated with this system is sketched in Figure]^ for 
n = 4. The BN has an empty graph on the set of nodes {Xi,..., X„} with the associated 
Dirac delta density functions located at soi, dct(si(0) — Soi)- ◄ 

3.3 Finite Abstraction of Markov Processes as Discrete DBNs 

Let A G B be a bounded Borel set of safe states. We abstract the structured Markov process 
interpreted in the previous section as a DBN with continuous variables to a DBN with 
discrete random variables. Our abstraction is relative to the set A. Algorithm [l] provides the 


For a general initial probability distribution tt : S ^ [0, l]i a set of arcs must be added to reflect its 
possible product structure. This construction is not important at the current stage becau se of the 
backward recursion formulation of the probabilistic safety (please refer to 0 in Section [ 2 .2[). 
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M Figure 2 Two-layered BN associated with the stochastic linear dynamical system in (|^ for 
n = 4. 


Algorithm 1 Abstraction of model as a DBN with f8_>. = {V,E,T) over discrete r.v. 


Require: input model safe set A 


Project safe set A in each dimension Di = 11^(A), i G 
Select finite Ui-dimensional partition of Di as Di = A^E^Dij, i G N„ 

For each Dij, select single representative point Zij G Dij, Zij — ^i{Dij) 

Construct the DAG (V,f), with V = {Xi,Xi,i G N„} and £ as per Section 
Define Zi = {zn,..., Zim}, i G N„, and take Eli = ZiU {</>z} as the finite state space of 
two r.v. Xi and Xi, (j)i being dummy variables as per Section 


3.2 


3.3 


6: Compute elements of the set T, namely CPD Ti related to the node z £ Ni, as 


r,(A, = z\v{Pa{X,))) 


SEi(z) 

1- L ti(silv(Pa(Xi)))dsi, 

< zdZi ' 

1 , 


2 G Zi, v(Pa(Xi)) n (/) = 0 
z = (j)i, v{Pa{Xi)) r)(j) = 0 

z = 4>i, v{Pa{X^)) r\cj)^0 
z G Zi, v{Pa{Xi)) n ^ yf 0 


Ensure: output DBN with f8_>. = {V,£,T) over discrete r.v. 


steps of the abstraction procedure. It consists of discretizing each dimension into a finite 
number of bins. 

In Algorithm]^ the projection operators 11^ : R" —> K, z € N„, are defined as 11^(8) = Si for 
any s = [si,..., G R". These operators are used to project the safe set A over different 
dimensions, Di = 11^(A). In step [2 of the Algorithm, set Di is partitioned as {Dij}'^'L^ 
(for any i G N„, D^j’s are arbitrary but non-empty, non-intersecting, and Di = 

The corresponding representative points G Dij are also chosen arbitrarily. Step of the 
algorithm constructs the support of the random variables in V = {Xi,Xi,i G N„}, 
and stepcomputes the discrete CPDs Ti{Xi\Pa{Xi)), reflecting the dependencies among 
the variables. For any z G Nji, —>• 2^* represents a set-valued map that associates 

to any point Zij G Zi the corresponding partition set Dij C Di (this is known as the 
“refinement map”). Furthermore, the abstraction map ^i : Di ^ Zi associates to any point 
Si G Di the corresponding discrete state in Zi. Additionally, notice that the absorbing states 
(j) = {(j)i,... ,(j)n} are added to the definition of BN so that the conditional probabilities 
Ti{Xi\Pa{Xi)) marginalize to one. The function v{-) used in stepj^acts on (possibly a set 
of) random variables and provides their instantiation. In other words, the term v{Pa{Xi)) 
that is present in the conditioned argument of ti leads to evaluate function ti{si\-) at the 
instantiated values of Pa{Xi). 

The construction of the DBN with discrete r.v. in Algorithm [T] is closely related to the 
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Markov chain abstraction method in Di. The main difference lies in partitioning in each 
dimension separately instead of doing it for the whole state space. Absorbing states are 
also assigned to each dimension separately instead of having only one for the unsafe set. 
Moreover, Algorithm stores the transition probabilities efficiently as a BN. 

3.4 Probabilistic Invariance for tbe Abstract DBN 

We extend the use of P by denoting the probability measure on the set of events defined 
over a DBN with discrete r.v. 2 ; = (Ai, X 2 ,..., X„). Given a discrete set Za C Hi ^*1 
probabilistic invariance problem asks to evaluate the probability pn{zq, Za) that a finite 
execution associated with the initial condition z{0) = Zq remains within the set Za during 
the finite time horizon t = 0,1,2,N. Formally, 

Pn{zo, Za) = F{z{t) e Za, for alH = 0,1, 2,..., A|2:(0) = Zq). 

This probability can be computed by a discrete analogue of the Bellman backward recursion 
(see [2] for details). 

► Theorem 5. Consider value functions Vff : —>■ [0,1], fc = 0,1,2,..., N, computed by 

the backward recursion 

V^{z) = lzS^) Y. A: = 0,1,2,...,A-1, (7) 

and initialized with V^{z) = lz^{z). Then the solution of the invariance problem is charac- 
terized as pn{zo, Za) = Vf^^z^). 

The discrete transition probabilities F{z\z) in Equation 0 are computed by taking the 
product of the CPD in T- More specifically, for any z,z G Hi of the form 2 ; = 
(zi,Z 2 ,...,z„),z = (zi,Z 2 ,...,z„) we have 

F{z\z) = = z,\Pa{X,) = z). 

i 

Our algorithm for probabilistic invariance computes pn{zq, Za) to approximate pn{sq, A), 
for suitable choices of Zq and Za depending on Sq and A. The natural choice for the initial 
state is Zq = (zi(0),..., 2„(0)) with 2^(0) = ^i(ni(so)). For A, the n-fold Cartesian product 
of the collection of the partition sets {D^ }, i G N„ generates a cover of A as 

^ ^ X X ... X {Dr.,}]=l 

~ 1^ “ (jl) J2) ■ ■ • ) Jri)) Dj = Dij^ X D2j2 X ... X ■ 

3 

We define the safe set of the DBN as 

== [J{{ziji,Z 2 j 2 ,---,ZnjJ, such that An Dj ^ 0 for j = {jl, j 2 , ■ ■ ■ , jn)} , (8) 

3 

which is a discrete representation of the continuous set A C 
^ = U J = 0'lfo2, ■ ■ ■ , jn), A n Dj ^ 0} . 


3 


( 9 ) 
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For instance A can be a finite union of hypercubes in M” if the partition sets Dij are intervals. 
It is clear that the set A is in general different form A. 

There are thus two sources of error: first due to replacing A with A, and second, due to 
the abstraction of the dynamics between the discrete outcome obtained by Theorem and 
the continuous solution that results from (|^. In the next section we provide a quantitative 
bound on the two sources of error. 

3.5 Quantification of the Error due to Abstraction 

Let us explicitly write the Bellman recursion ([^ of the safety problem over the set A: 

W^(s) = l^(s), Wkis)= f Wk+i{s%{s\s)ds, fc = 0,l,2,...,IV-l, (10) 

JA 

which results in prs[{sQ,A) = ITo(so). Theoremcharacterizes the error due to replacing the 
safe set Ahj A. 

► Theorem 6. Solution of the probabilistic invariance problem with the time horizon N and 
two safe sets A, A satisfies the inequality 

bAr(so,^) -Pjv(so,-4)| < MNC{AAA), Vsq gAhA, 

where M = sup {ts(s|s)|s, s G A/S.A^. C{B) denotes the Lebesgue measure of any set B G B 
and AAA = (A\A) U (^\^) is the symmetric difference of the two sets A, A. 

The second contribution to the error is related to the discretization of Algorithm 
which is quantified by posing regularity conditions on the dynamics of the process. The 
following Lipschitz continuity assumption restricts the generality of the density functions tk 
characterizing the dynamics of model 

► Assumption 1. Assume the density functions tfe(si|-) are Lipschitz continuous with the 
finite positive dij 

~ )l — diij\si — Sj|, 

with s = [si,... ... ,s„] and s' = [si,..., Si_i, s', s^+i,... ,s„], for all Sfe,s),,Sfc e 

Dk, k G N„, and for all i,j G N„. 

Note that Assumptionholds with dij = 0 if and only if {Xi,Xj) ^ £ in the DAG of the 
BN *8_i.. Assumption [2 enables us to assign non-zero weights to the arcs of the graph and 
turn it into a weighted DAG. The non-zero weight Wij = dijC(Dj) is assigned to the arc 
{Xi,Xj) G £, for all i,j G N„. We define the out-weight of the node Xi by Oi = 
and the in-weight of the node Xj by Ij 

► Remark. The above assumption implies Lipschitz continuity of the conditional density 
functions tj(sj|s). Since trivially |si — s'| < ||s — s'|| for all i G N„, we obtain 

|tj(sj|s) — tj(sj|s')| < 'HjWs — s'll Vs, s' G A, Sj G Dj, 

where Bj = density function ts(s|s) is also Lipschitz continuous if the density 

functions tj(sj|s) are bounded, but the boundedness assumption is not necessary for our 
result to hold. 

Assumption [T] enables us to establish Lipschitz continuity of the value functions Wk in 
( |10[ ). This continuity property is essential in proving an upper bound on the discretization 
error of Algorithm [T] which is presented in Gorollary 
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► Lemma 7. Consider the value functions Wk(‘), k = 0,1,2,, N, employed in Bellman 
recursion (10) of the safety problem over the set A. Under Assumption^ these value 
functions are Lipschitz continuous 

\Wk{s) - Wk{s')\ < k||s - s'll, Vs, s' e A, 


for all k = 0,1,2,... ,N with the constant n = where Ij is the in-weight of the 

node Xj in the DAG of the BN 

► Corollary 8 . The following inequality holds under Assumption^ 

\ pn { so , A) - pn { zo , Za)\ < MNC{AAA) + NkS Vsq S A, 

where piq{zQ, Za) is the invariance probability for the DBN obtained by Algorithm^ The 
initial state of the DBN is Zq = (zi(0),..., 2:„(0)) with Zi{0) = ^i(ni(so)). The set Za and 
the constant M are defined in ([^ and Theorem respectively. The diameter of the partition 
of Algorithm^is defined and used as S = sup{||s — s'||, Vs, s' G Dj,yj Dj C A}. 

The second error term in Corollary is a linear function of the partition diameter 5, 
which depends on all partition sets along different dimensions. We are interested in proving 
a dimension-dependent error bound in order to parallelize the whole abstraction procedure 
along different dimensions. The next theorem gives this dimension-dependent error bound. 

► Theorem 9. The following inequality holds under Assumption^ 

n 

\pN{so,A)-pN{zo,Za)\<MNC{AAA)-^NY,O^S, Vsq G A, (11) 

with the constants defined in Corollary^ Oj is the out-weight of the node Xi in the DAG of 
the BN 25. The quantity 6i is the maximum diameter of the partition sets along the 
dimension Si = sup{|si — s'|,Vsi,s' G Dij,\/j G N„;}. 

For a given error threshold e, we can select the set A and consequently the diameters Si such 
that MNC{AAA) -\- NJ27=i ^ Therefore, generation of the abstract DBN, namely 
selection of the partition sets {Dij, j G Ni} (according to the diameter Si) and computation 
of the CPD, can be implemented in parallel. For a given e and set A, the cardinality of the 
state space Di,i G N^, of the discrete random variable Xi and thus the size of the CPD Ti, 
grow linearly as a function of the horizon of the specification N. 


Efficient Model Checking of the Finite-State DBN 

Existing numerical methods for model checking DBNs with discrete r.v. transform the 
DBN into an explicit matrix representation mmaiiD], which defeats the purpose of a 
compact representation. Instead, we show that the multiplicative structure of the transition 
probability matrix can be incorporated in the computation which makes the construction of 
¥{z\z) dispensable. For this purpose we employ factor graphs and the sum-product algorithm 
HU originally developed for marginalizing functions and applied to belief propagation in 
Bayesian networks. Suppose that a global function is given as a product of local functions, 
and that each local function depends on a subset of the variables of the global map. In its 
most general form, the sum-product algorithm acts on factor graphs in order to marginalize 
the global function, i.e., taking summation respect to a subset of variables, exploiting its 
product structure m- In our problem, we restrict the summation domain of the Bellman 
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M Figure 4 Spanning tree of the linear stochastic system 
■ Figure 3 Factor graph of the linear in ([1 forn = 4 and two orderings (^ 4 , ^ 3 , ^ 2 , z^) (top 
stochastic system (§ for n = 4. pl^^^ 


recursion Q to J([j Zi because the value functions are simply equal to zero in the complement 
of this set. The summand in Q has the multiplicative structure 

g{z,z) = lzAz)Vk+i{z)'[[T,{Xi = z,\Pa{X,) = z), V^{z) = ^ g{z,z). (12) 

The function g{z,z) depends on variables {zi,Zi, i S N„}. The factor graph of g{z,z) 
has 2 n variable nodes, one for each variable and (n + 2 ) funetion nodes for local functions 
yk+n'^i- connects a variable node to a function node if and only if the variable 

is an argument of the local function. The factor graph of Example]^ for n = 4 is presented 
in Figure]^- factor graphs of general functions g{z, z) in ( [T^ are similar to that in Figure 
the only part needing to be modified being the set of arcs connecting variable nodes 
{zi, i G N„} and function nodes {T^, i G N„}. This part of the graph can be obtained from 
the DAG of of the DBN. 

The factor graph of a function g(z, z) contains loops for n>2 and must be transformed 
to a spanning tree using clustering and stretching transformations m- For this purpose 
the order of clustering function nodes G N„} and that of stretching variable nodes 

{zi, i G Nn} needs to be chosen. Figure presents the spanning trees of the stochastic 
system in for two such orderings. The variable nodes at the bottom of each spanning 
tree specify the order of the summation, whereas the function nodes considered from the 
left to the right indicate the order of multiplication of the local functions. The rest of the 
variable nodes show the arguments of the intermediate functions, which reflects the required 
memory for storing such functions. The computational complexity of the solution carried 
out on the spanning tree clearly depends on this ordering. 

Algorithm presents a greedy procedure that operates on the factor graph and provides 
an ordering of the variables and of the functions, in order to reduce the overall memory usage. 
This algorithm iteratively combines the function nodes and selects the next variable node, 
over which the summation is carried out. The output of this algorithm implemented on 
the factor graph of Example]^ is the orderings Kf = (z 4 , Z 3 , Z 2 , zi) and ej = (T 4 , T 3 , T 2 , Ti), 
started from the outermost sum, which is related to the spanning tree on top of Figure ^ 

[~5l Comparison with the State of the Art 

In this section we compare our approach with the state-of-the-art abstraction procedure 
presented in [T] (referred to as AKLP in the following), which does not exploit the structure 
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Algorithm 2 Greedy algorithm for obtaining the order of stretching variables and clustering 
functions in the factor graph 


Require: Factor graph of the summand in Bellman recursion 
1 : Initialize the sets U\ = {zj, i € N„}, U 2 = {zi, i G Nn}, U 3 = {Tj, i G N„}, ej = kj = 0 

2: while Ui do 

3: For any node u€U^ compute Pa^(u) (resp. Ch^{u)) as the elements of Ui (resp. U 2 ) 

connected to u by an arc in the factor graph 
4: Define the equivalence relation R on U 3 as uRu iff Paf{u) = Paf{u) 

5: Replace the set U 3 with the set of equivalence classes induced by R. 

6 : Combine all the variable nodes of Chf{u) connected to one class 

7: Select u € U 3 with the minimum cardinality of Paf{u) and put ej = (u, ef),Kf = 

{Chf{u),K^) 

8 : Update the sets Ui = Ui\Paf{u), U 2 = ^^2 U Pa^{u)\Ch^{u), LI 3 = Z/f 3 \{u}, and 

eliminate all the arcs connected to u 

9: end while 

Ensure: The order of variables Kf and functions ej 


of the dynamics. The AKLP algorithm approximates the concrete model with a finite-state 
Markov chain by uniformly gridding the safe set. As in our work, the error bound of the 
AKLP procedure depends on the global Lipschitz constant of the density function of the 
model, however it does not exploit its structure as proposed in this work. We compare the 
two procedures on ( 1 ) error bounds and ( 2 ) computational resources. 

Consider the stochastic linear dynamical model in ([^, where <I> = [ciijjij is an arbitrary 
matrix. The Lipschitz constants dij in Assumption can be computed as dij = |aji|/cr|-\/27re, 
where e is Euler’s constant. From Theorem]^ we get the following error bound: 


cdbn = MNjO,{AAA) -f 


N 


V 

\/27re 



C{D,)S,. 


On the other hand, the error bound for AKLP is 

Ae-1/2 


CAKLP = M N C{AAA) 




aia2 ... cr„ 




In order to meaningfully compare the two error bounds, select set A = [—a, a]” and 
ai — a,i ^ N„, and consider hypercubes as partition sets. The two error terms then become 


cdbn = ‘^nrj 


I'&lli 


nwn 


CAKLP = ‘7?7"'||d>||2, TJ = 


2 a 


NS 


c = 


crve 


where ||d>||i and ||d >||2 are the entry-wise one-norm and the induced two-norm of matrix $, 
respectively. The error caklp depends exponentially on the dimension n as rf', whereas we 
have reduced this term to a linear one [nr]) in our proposed new approach resulting in error 
eoBN- Note that rj < 1 means that the standard deviation of the process noise is larger than 
the selected safe set: in this case the value functions (which characterize the probabilistic 
invariance problem) uniformly converge to zero with rate 77 "; clearly the case of 77 > 1 is 
more interesting. On the other hand for any matrix $ we have < ||‘I’|| 2 . This second 
term indicates how sparsity is reflected in the error computation. Denote by r the degree of 
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M Table 1 Comparison of the AKLP and the DBN-based algorithms, over the stochastic linear 
dynamical model in Q. The number of partition sets (or bins) per dimension, the number of 
marginals, and the total required number of (addition and multiplication) operations for the 
verification step, are compared for models of different dimensions (number of continuous variables 
n). 


dimension n 

1 

2 

3 

4 

5 

6 

7 

8 

# bins/dim 

AKLP 

DBN 

1.2 x 10^ 
1.2 x 10^ 

1.1 x lO'' 
3.6 x 10® 

6.0 x 10® 
6.0 x 10® 

2.9 x 10® 
8.5 x 10® 

1.3 x 10® 
1.1 x 10® 

5.8 x 10® 
1.3 x 10® 

2.5 x 10’’ 

1.6 x 10® 

1.1 x 10® 
1.8 x 10® 

# marginals 

AKLP 

DBN 

1.5 x 10® 
1.5 x 10® 

1.5 x 10®® 
4.8 x 10®“ 

4.8 x 10®® 
4.4 x 10®® 

4.8 x 10“®® 

1.8 x 10®® 

1.5 x 10®® 
5.2 x 10®® 

1.5 x 10®® 
1.2 x 10®® 

4.3 x 10®“® 

2.3 x 10®® 

3.5 x 10®®® 
4.2 x 10®® 

# operations 

AKLP 

DBN 

2.9 x 10’’ 
2.9 x 10’’ 

3.1 x 10®’’ 
1.9 x 10®® 

1.0 x 10®“ 
8.0 x 10®“ 

1.1 x 10®® 
3.5 x 10®® 

3.7 x 10®® 

1.7 x 10®® 

3.7 x 10®® 
8.9 x 10®“ 

1.1 x 10®“® 

5.2 x 10®® 

9.5 x 10®®“ 
3.4 x 10®“ 


connectivity of the DAG of S-j. for this linear system, which is the maximum number of 
non-zero elements in rows of matrix $. We apply Lemma in the Appendix to matrix $ to 
get the inequalities 

_ ll^lli ^ 

||d>||2 < vurmax loiJ, —^<^max|aiJ, 

i,j riy/n yn hi 

which shows that for a fixed dimension n, sparse dynamics, compared to fully connected 
dynamics, results in better error bounds in the new approach. 

In order to compare computational resources, consider the numerical values N = 10, 
a = 1, a — 0.2, and the error threshold e = 0.2 for the lower bidiagonal matrix $ with all 
the non-zero entries set to one. Table [l] compares the number of required partition sets 
(or bins) per dimension, the number of marginals, and the required number of (addition 
and multiplication) operations for the verification step, for models of different dimensions 
(number of continuous variables n). The numerical values in Table confirm that for a given 
upper bound on the error e, the number of bins per dimension and the required marginals 
grow exponentially in dimension for AKLP and polynomially for our DBN-based approach. 
For instance, to ensure the error is at most e for the model of dimension n = 4, the cardinality 
of the partition of each dimension for the uniform gridding and for the structured approach is 
2.9 X 10® and 8.5 x 10®, respectively. Then, AKLP requires storing 4.8 x 10^® entries (which 
is infeasible!), whereas the DBN approach requires 1.8 x 10^^ entries (~ 8GB). The number 
of operations required for computation of the safety probability are 1.1 x 10"^® and 3.5 x 10^®, 
respectively. This shows a substantial reduction in memory usage and computational time 
effort: with given memory and computational resources, the DBN-based approach in compare 
with AKLP promises to handle systems with dimension that is at least twice as large. 


6 


Conclusions and Future Directions 


While we have focused on probabilistic invariance, our abstraction approach can be extended 
to more general properties expressed within the bounded-horizon fragment of PCTL m or 
to bounded-horizon linear temporal properties [231111], since the model checking problem for 
these logics reduce to computations of value functions similar to the Bellman recursion scheme. 
Our focus in this paper has been the foundations of DBN-based abstraction for general 
Markov processes: factored representations, error bounds, and algorithms. We are currently 
implementing these algorithms in the FAUST^ tool [5], and scaling the algorithms using 
dimension-dependent adaptive gridding [5] as well as implementations of the sum-product 
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algorithm on top of data structures such as algebraic decision diagrams (as in probabilistic 

model checkers ini)- 
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I A I Proof of Statements 

Proof of Theorem [6l Recall the recursive equations for the probabilistic safety problem 
over sets A and A as in © and ( |I0| , respectively. The solutions of the safety problems 
are pn{so,A) = Vb(^o) and pn{sq,A) = Wo(ao)- We prove inductively that the inequality 
|t4(s) — II4(s)| < M{N — k)C{A/S.A) holds for all s G A n A. This inequality is true for 
k = N. For any fc = 0,1,2,..., — 1 and any state s G A n A we have 

\Vk{s) - Wk{s)\ < [ |Vfc+i(s) - IFfe+i(s)|ts(s|s)ds 
JAnA 

-b / Vk+i{s)tAs\s)ds + / VFfc+i(s)ts(s|s)ds 
Ja\a Ja\a 

< M{N -k- 1 )/:(AAA) -f ML{A\A) + MC{A\A) 

= M{N -k)C{AAA). 

The inequality for k = 0 proves upper bound MNC{AAA) on |pjv(ao, A) — p 7 v(so, A)|. ◄ 


Proof of Lemma |7j The inequality holds for k = N. For A: = 0,1, 2,..., A — I and any 
a, s' G A we have 


|ITfc(s) - VFfc(s')| < [ Wk+iis)\tAs\s) - tAs\s')\ds < [ |ts(s|s) - ts(s|s')|ds 
J A J A 

Next, we employ a telescopic sum for the multiplicative structure of the density functions in 
the integrand on the right-hand side, to obtain: 


\Wk{s)-Wk{s')\ < 


ds 


]^ti(si|s) - ]^ti(Si|s') 
j—1 n 


E 

i=i 


^E 

i=i' 


Z=1 


J-l 


^=J 


i=l 


i=j+l 


ds 




i=l 


i=i + l 


ds 


n « 

JD, 




<Y^HA\s- s'\\C{D,) = ||s - s'\\Y.H,C{D,) = II s - s'II 
i=i i=i i=i 


◄ 
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► Lemma 10. The entry-wise one-norm and two-norm of square matrices are equivalent: 

n |!$||2 < 11^*111 < n ^/ n \ m \ 2 , 

where n is the dimension of the matrix <i> = S 

Proof of Lemma [inj Define ri(<&) = l Cauchy- 

Schwartz inequality implies that 


Cj($) < y/n ^ |a„f = y/n\\^ei\\2 < v^||«>||2 

\ i=i 

n n 

j=l j=l 

where ei = [1, 0,0,, 0]^. On the other hand for any s = [si, S 2 j • ■ • > Sn]^ with ||s ||2 = 1, 


l^allo = 


lOiiSi + ai2S2 + ... + 
1/2 


^in^r. 


,i=l 


1/2 


< 


^(|a* 


11^ + |Qi2p ' 


.i=l 


1/2 


E 

M = 1 


an 


n 

*j'=i 


As you see here the ratio ||d>||i/||<h|j 2 is bounded from below by the dimension of the 
matrix and also from above by the n^fn. 

► Lemma 11 ([l5]). The maximum singular value of a matrix can be hounded based on its 
sparsity pattern. In particular for any 

||$||2 < . max. 










